Data Storage and File Compression (Lossy & Lossless)
Objectives : Student should be able to -
Measurement of data storage
Q1. Describe the following Basic Units of measuring computer memory storage.
(i) Bit :
⇒ Bit stands for binary digit. It is the basic unit of data storage in computer memory.
⇒ A bit is either 1 or 0, representing the ON or OFF state of an electrical signal, which is the only thing a computer can understand.
(ii) Byte :
⇒ A group of 8 bits is called a Byte.
⇒ A byte is the smallest addressable unit of memory in most computers.
⇒ Processors are built to work with a fixed number of bits at a time, which is usually a whole number of bytes, such as 8, 16, 32 or 64 bits.
(iii) Nibble :
⇒ A group of 4 bits is called a Nibble, which is equal to half a byte.
⇒ A nibble can represent 2^4 = 16 possible values. Hence, each hexadecimal digit is represented by a nibble.
Q2. Give different units of measuring computer memory in terms of bits and bytes, adopted by the IEC (International Electrotechnical Commission) that is based on the binary system.
Unit of Memory | Bits / Bytes | Bytes | General Name |
1 Nibble | 2^2 = 4 bits | - - - - - | Nibble |
1 Byte | 2^3 = 8 bits | 1 Byte | Byte |
1 KiB (Kibibyte) | 2^10 Bytes | 1024 Bytes | Kilobyte |
1 MiB (Mebibyte) | 2^20 Bytes | 1024 KiB | Megabyte |
1 GiB (Gibibyte) | 2^30 Bytes | 1024 MiB | Gigabyte |
1 TiB (Tebibyte) | 2^40 Bytes | 1024 GiB | Terabyte |
1 PiB (Pebibyte) | 2^50 Bytes | 1024 TiB | Petabyte |
1 EiB (Exbibyte) | 2^60 Bytes | 1024 PiB | Exabyte |
1 ZiB (Zebibyte) | 2^70 Bytes | 1024 EiB | Zettabyte |
1 YiB (Yobibyte) | 2^80 Bytes | 1024 ZiB | Yottabyte |
Note : The International System of Units (SI) used around the world adopts the decimal system of measurement (in terms of powers of 10).
In SI units, 1 Kilobyte (KB) = 1000 Bytes, 1 Megabyte (MB) = 1000 KB, 1 Gigabyte (GB) = 1000 MB, 1 Terabyte (TB) = 1000 GB. These decimal values only approximate the size of computer memory, because memory size is actually measured in terms of powers of 2; the IEC units describe it exactly.
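The gap between the binary (IEC) and decimal (SI) units can be checked with a short Python sketch. The names `IEC_UNITS`, `SI_UNITS` and `unit_sizes` are made up for this illustration.

```python
# Binary (IEC) units grow by a factor of 1024; decimal (SI) units by 1000.
IEC_UNITS = ["KiB", "MiB", "GiB", "TiB"]
SI_UNITS = ["KB", "MB", "GB", "TB"]


def unit_sizes():
    """Return (iec_name, iec_bytes, si_name, si_bytes) for each unit step."""
    rows = []
    for power, (iec, si) in enumerate(zip(IEC_UNITS, SI_UNITS), start=1):
        rows.append((iec, 1024 ** power, si, 1000 ** power))
    return rows


for iec, iec_bytes, si, si_bytes in unit_sizes():
    # How much larger the binary unit is than the decimal unit, in percent.
    gap = (iec_bytes - si_bytes) / si_bytes * 100
    print(f"1 {iec} = {iec_bytes} bytes, 1 {si} = {si_bytes} bytes, gap = {gap:.1f}%")
```

The gap widens with every step, which is why the choice of unit matters more for large files and memory cards.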
Calculation of file size
To calculate the file size of a bitmap image :
Size of image file (in bits) = Image resolution (in pixels) x Colour depth (in bits)
= Width (in pixels) x Height (in pixels) x Colour depth (in bits)
To calculate the file size of a mono sound (1 channel of audio) :
Size of sound file (in bits) = Sample rate (in Hz) x Sample resolution (in bits) x Length of sound (in seconds)
To calculate the file size of a stereo sound (2 channels of audio) :
Stereo is sound recorded with two microphones placed in strategically chosen locations relative to the sound source, and played back through two channels (speakers). The two simultaneously recorded channels will be similar, but each will have distinct time-of-arrival and sound-pressure-level information.
Size of stereo sound file (in bits) = Sample rate (in Hz) x Sample resolution (in bits) x Length of sound (in seconds) x 2 (number of channels)
To convert the size of a file from bits to bytes, kibibytes, mebibytes, etc. :
Size of file (in bytes) = Size of file (in bits) / 8, since 1 byte = 8 bits
Size of file (in Kibibytes) = Size of file (in bits) / (8 x 1024), since 1 KiB = 1024 bytes
Size of file (in Mebibytes) = Size of file (in bits) / (8 x 1024 x 1024), since 1 MiB = 1024 KiB
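The file size formulas above can be written as small Python helpers. This is only a sketch; the function names are chosen for this illustration.

```python
def image_size_bits(width, height, colour_depth):
    """Bitmap size in bits: width x height x colour depth."""
    return width * height * colour_depth


def sound_size_bits(sample_rate, sample_resolution, seconds, channels=1):
    """Sound size in bits; use channels=2 for a stereo recording."""
    return sample_rate * sample_resolution * seconds * channels


def bits_to_bytes(bits):
    return bits / 8


def bits_to_kib(bits):
    return bits / (8 * 1024)


def bits_to_mib(bits):
    return bits / (8 * 1024 * 1024)


# Example: a 512 x 300 image with a colour depth of 3 bits.
bits = image_size_bits(512, 300, 3)
print(bits, bits_to_bytes(bits), bits_to_kib(bits))
```

Keeping the size in bits until the last step avoids rounding errors in the intermediate values.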
Q3. a) Calculate the file size of an image with 8 colours, captured at a resolution of 512 x 300 pixels.
8 colours = 2^3 colours
Hence, colour depth = 3 bits (i.e. the number of bits needed to represent 8 colours).
Size of image file (in bits) = Image resolution x Colour depth
= 512 x 300 x 3 bits = 460800 bits
Size of image file (in bytes) = (512 x 300 x 3) / 8 = 57600 bytes (divide by 8, since 1 byte = 8 bits)
Size of image file (in KiB) = (512 x 300 x 3) / (8 x 1024) = 56.25 KiB (since 1 KiB = 1024 bytes)
(Or) approximately (512 x 300 x 3) / (8 x 1000) = 57.6 KB (taking 1 KB = 1000 bytes)
b) What would be the file size of the image if it is converted into Black and White?
Black and White means 2 colours = 2^1 colours
Hence, colour depth = 1 bit (i.e. the number of bits needed to represent 2 colours).
Size of image file (in bits) = Image resolution x Colour depth
= 512 x 300 x 1 bits = 153600 bits
Size of image file (in bytes) = (512 x 300 x 1) / 8 = 19200 bytes (divide by 8, since 1 byte = 8 bits)
Size of image file (in KiB) = (512 x 300 x 1) / (8 x 1024) = 18.75 KiB (since 1 KiB = 1024 bytes)
(Or) approximately (512 x 300 x 1) / (8 x 1000) = 19.2 KB (taking 1 KB = 1000 bytes)
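Both parts of Q3 start by turning a number of colours into a colour depth. A minimal Python sketch of that step follows; the helper names `colour_depth` and `image_size_kib` are chosen for this illustration.

```python
def colour_depth(num_colours):
    """Smallest number of bits that can give each colour its own code."""
    # (n - 1).bit_length() is the number of bits needed to count from 0 to n - 1.
    return max(1, (num_colours - 1).bit_length())


def image_size_kib(width, height, num_colours):
    """Bitmap size in KiB for an image that uses num_colours colours."""
    bits = width * height * colour_depth(num_colours)
    return bits / 8 / 1024


print(colour_depth(8), image_size_kib(512, 300, 8))  # 3 56.25
print(colour_depth(2), image_size_kib(512, 300, 2))  # 1 18.75
```

Reducing the image from 8 colours to 2 cuts the colour depth from 3 bits to 1 bit, so the file shrinks to one third of its size.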
Q4. A camera detector has an array of 1920 by 1536 pixels. A colour depth of 16 bits is used.
Calculate the size of a photograph taken by this camera, giving your answer in MiB.
Size of image file (in bits) = Image resolution x Colour depth
= 1920 x 1536 x 16 bits
Size of image file (in bytes) = (1920 x 1536 x 16) / 8 = 5898240 bytes (divide by 8, since 1 byte = 8 bits)
Size of image file (in MiB) = (1920 x 1536 x 16) / (8 x 1024 x 1024) (since 1 MiB = 1024 x 1024 bytes)
= 5.625 MiB
Q5. Photographs have been taken by a smartphone which uses a detector with a 1024 x 1536 pixel array. The software uses a colour depth of 24 bits.
How many photographs could be stored on a 16 GiB memory card?
Size of each photo (in bits) = Image resolution x Colour depth
= 1024 x 1536 x 24 bits
Size of each photo (in bytes) = (1024 x 1536 x 24) / 8 = 4718592 bytes (divide by 8, since 1 byte = 8 bits)
16 GiB (in bytes) = 16 x 1024 x 1024 x 1024 bytes
Number of photos on 16 GiB = 16 GiB (in bytes) / Size of one photo (in bytes)
= (16 x 1024 x 1024 x 1024 x 8) / (1024 x 1536 x 24) = 3640.89
= 3640 photos (rounded down, because part of a photo cannot be stored)
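The rounding down in Q5 is integer division. A short Python sketch, with the hypothetical helper name `photos_on_card`:

```python
def photos_on_card(width, height, colour_depth, card_gib):
    """Number of whole photos that fit on a card of card_gib GiB."""
    photo_bits = width * height * colour_depth
    photo_bytes = (photo_bits + 7) // 8   # round up to whole bytes
    card_bytes = card_gib * 1024 ** 3     # 1 GiB = 1024 x 1024 x 1024 bytes
    return card_bytes // photo_bytes      # floor: part of a photo cannot be stored


print(photos_on_card(1024, 1536, 24, 16))  # 3640
```

This sketch ignores any space the card's file system uses for itself, as the worked answer does.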
Q6. A five-minute audio recording is sampled at 44.1 kHz with a 16-bit resolution.
Calculate the bit rate and the size of the audio file.
Bit rate (in bps) = Sample rate (Hz) x Sample resolution = 44100 x 16 = 705600 bps
Bit rate (in kbps) = (44100 x 16) / 1000 = 705.6 kbps (since 1 kbps = 1000 bits/sec)
File size (in bits) = Sample rate (Hz) x Sample resolution x Length of sound (in sec) = 44100 x 16 x (5 x 60) = 211680000 bits
File size (in MiB) = (44100 x 16 x 300) / (8 x 1024 x 1024) = 25.23 MiB
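Bit rate and file size for Q6 can be checked with the Python sketch below. The function names are chosen for this illustration, and a mono recording is assumed because the question does not mention channels.

```python
def bit_rate_kbps(sample_rate_hz, sample_resolution_bits, channels=1):
    """Bits produced per second of audio, in kbps (1 kbps = 1000 bits/sec)."""
    return sample_rate_hz * sample_resolution_bits * channels / 1000


def audio_size_mib(sample_rate_hz, sample_resolution_bits, seconds, channels=1):
    """Uncompressed audio size in MiB."""
    bits = sample_rate_hz * sample_resolution_bits * seconds * channels
    return bits / (8 * 1024 * 1024)


print(bit_rate_kbps(44100, 16))                      # 705.6
print(round(audio_size_mib(44100, 16, 5 * 60), 2))   # 25.23
```

The file size is simply the bit rate multiplied by the length of the recording, then converted from bits to MiB.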
Q7. A 30-second audio recording is sampled at a rate of 44.1 kHz using 8 bits per sample. Two channels are used to allow for stereo recording.
Calculate :
a) the size of one sample, in bits.
Size of one sample, i.e. one channel of the recording (in bits) = Sample rate (Hz) x Sample resolution x Length of audio (in sec)
= 44100 x 8 x 30 = 10584000 bits.
b) the size of audio recording in MiB.
File size of stereo recording (in bits) = Size of one sample (in bits) x 2 channels
= 10584000 x 2 = 21168000 bits
File size of stereo recording (in MiB) = 21168000 / (8 x 1024 x 1024) = 2.52 MiB
Q8. A typical song stored on a music CD is 3 minutes and 30 seconds long. Assume each song is sampled at 44.1 kHz, 16 bits are used per sample, and each song uses two channels.
Calculate how many typical songs could be stored on a 740 MiB CD.
Size of one sample, i.e. one channel of the song (in bits) = Sample rate (Hz) x Sample resolution x Length of audio (in sec)
= 44100 x 16 x (3 x 60 + 30)
= 44100 x 16 x 210 = 148176000 bits.
File size of stereo song (in bits) = Size of one sample (in bits) x 2 channels
= 148176000 x 2 = 296352000 bits
File size of stereo song (in bytes) = 296352000 / 8 = 37044000 bytes
Number of songs on 740 MiB = 740 MiB (in bytes) / Size of a stereo song (in bytes)
= (740 x 1024 x 1024) / 37044000 = 20.95
= 20 songs could be stored on a 740 MiB CD (rounded down, because part of a song cannot be stored).
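The Q8 answer can be reproduced in a few lines of Python. This is a sketch with the hypothetical helper name `songs_on_cd`; it assumes the whole 740 MiB is available for audio data.

```python
def songs_on_cd(song_seconds, sample_rate_hz, sample_resolution_bits,
                channels, cd_mib):
    """Number of whole uncompressed songs that fit on a CD of cd_mib MiB."""
    song_bits = sample_rate_hz * sample_resolution_bits * song_seconds * channels
    song_bytes = (song_bits + 7) // 8    # round up to whole bytes
    cd_bytes = cd_mib * 1024 * 1024      # 1 MiB = 1024 x 1024 bytes
    return cd_bytes // song_bytes        # floor: part of a song cannot be stored


print(songs_on_cd(3 * 60 + 30, 44100, 16, 2, 740))  # 20
```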
Data Compression
Q9. a) State what is meant by Data / File Compression
b) Why is it necessary to compress files?
Q10. a) Describe how Lossless compression reduces the file size.
b) Explain how program file, text or document files could be compressed.
Q11. Explain how the sentence below would be stored with a reduction of about 40% (ignoring spaces).
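Original : “COMPARE TEXT FILES IN A COMPUTER AFTER FILE COMPRESSION” (47 characters, ignoring spaces)
Compressed : “1ARE TEXT 2S IN A 1U3 AF3 2 1RESSION” (28 characters, ignoring spaces)
Where each repeated group of letters is replaced by a one-digit code : 1 = COMP, 2 = FILE, 3 = TER. The reduction is (47 - 28) / 47, which is about 40%.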
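The substitution used in Q11 can be sketched in Python. The codes 1 = COMP, 2 = FILE and 3 = TER are inferred from the compressed sentence; they are not stated in the notes. The function names are chosen for this illustration.

```python
# Inferred dictionary: each repeated group of letters gets a one-digit code.
DICTIONARY = {"COMP": "1", "FILE": "2", "TER": "3"}


def compress(text, dictionary):
    """Replace every dictionary pattern with its short code."""
    for pattern, code in dictionary.items():
        text = text.replace(pattern, code)
    return text


def decompress(text, dictionary):
    """Reverse the substitution (safe here because the text has no digits)."""
    for pattern, code in dictionary.items():
        text = text.replace(code, pattern)
    return text


def reduction_percent(original, compressed):
    """Percentage saved, ignoring spaces as the question asks."""
    before = len(original.replace(" ", ""))
    after = len(compressed.replace(" ", ""))
    return (before - after) / before * 100


sentence = "COMPARE TEXT FILES IN A COMPUTER AFTER FILE COMPRESSION"
packed = compress(sentence, DICTIONARY)
print(packed)
print(round(reduction_percent(sentence, packed), 1))
```

Because the dictionary allows the original sentence to be rebuilt exactly, this is a lossless method. The sketch does not count the space needed to store the dictionary itself.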
Q12. a) Describe how Run Length Encoding (RLE) algorithm is used to reduce the file size.
⇒ Run Length Encoding is a lossless data compression algorithm.
⇒ It reduces the size of a string with consecutive identical data (e.g. repeated character of text or pixels of an image).
⇒ Each run of repeating characters in the string is encoded with two values :
- the first value represents the number of repetitions or identical data items in the run.
- the second value represents the code of the repeated data item (ASCII code for characters, Pixel detail for an image)
⇒ RLE is only effective where there is a long run of repeated units of data.
b) Explain how the size of the following text string could be reduced using Run Length Encoding (RLE) algorithm.
'a a a a a b b b b c c d d d d d'
5 97 4 98 2 99 5 100
Where, in each pair, the first values 5, 4, 2 and 5 are the run lengths (number of repetitions) and 97, 98, 99 and 100 are the ASCII codes of the repeated characters a, b, c and d.
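A minimal Python sketch of this form of RLE follows. It assumes the spaces in the example string only separate the characters for readability, so they are left out of the input; the function names are chosen for this illustration.

```python
from itertools import groupby


def rle_encode(text):
    """Return [count, ASCII code, count, ASCII code, ...] for each run."""
    pairs = []
    for char, run in groupby(text):
        pairs.extend([len(list(run)), ord(char)])
    return pairs


def rle_decode(pairs):
    """Rebuild the original text from the (count, ASCII code) pairs."""
    counts, codes = pairs[0::2], pairs[1::2]
    return "".join(chr(code) * count for count, code in zip(counts, codes))


encoded = rle_encode("aaaaabbbbccddddd")
print(encoded)               # [5, 97, 4, 98, 2, 99, 5, 100]
print(rle_decode(encoded))   # aaaaabbbbccddddd
```

The 16 characters are stored as 8 values, and decoding gives back exactly the original string, which is what makes RLE lossless.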
c) Explain how the size of the following text string could be reduced using Run Length Encoding (RLE) algorithm.
'a a a a a a a a b b b b b b b b b b c d c d c d c c c c c c c c'
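255 8 97 255 10 98 99 100 99 100 99 100 255 8 99
Where 255 is a flag showing that the next two values are a run length and the ASCII code of the repeated character. Characters that are not repeated (c d c d c d) are stored as their ASCII codes without a flag, so they do not take up extra space.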
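The flagged form of RLE can be sketched in Python as follows. The flag value 255 comes from the answer above; the threshold `min_run=4` is an assumption of this sketch (a flagged run costs three values, so shorter runs are cheaper to store as they are), and run lengths above 255 are not handled.

```python
from itertools import groupby

FLAG = 255  # marks "the next two values are a run length and an ASCII code"


def rle_encode_flagged(text, min_run=4):
    """Encode long runs as FLAG, count, code; store other characters as is."""
    out = []
    for char, run in groupby(text):
        count = len(list(run))
        if count >= min_run:
            out.extend([FLAG, count, ord(char)])
        else:
            out.extend([ord(char)] * count)
    return out


# The string from the question, written without the separating spaces.
text = "a" * 8 + "b" * 10 + "cd" * 3 + "c" * 8
encoded = rle_encode_flagged(text)
print(encoded)
print(len(text), "characters stored as", len(encoded), "values")
```

Without the flag, the alternating part "cdcdcd" would need a count for every single character and the encoded data would be longer than the original.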
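Q13. a) Describe how the following Black and White image could use the Run Length Encoding (RLE) algorithm to reduce its size without losing its quality.
9W 6B 2W 1B 7W 1B 7W 5B 3W 1B 7W 1B 7W 1B 6W
91 60 21 10 71 10 71 50 31 10 71 10 71 10 61
Where, in the first line, W = white and B = black; in the second line, white is stored as 1 and black as 0 after each run length.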
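The runs above can be expanded back into pixels with a short Python sketch. The width of 8 pixels is an assumption: the runs add up to 64 pixels, which fits an 8 x 8 grid. The function names are chosen for this illustration.

```python
import re

ENCODED = "9W 6B 2W 1B 7W 1B 7W 5B 3W 1B 7W 1B 7W 1B 6W"
WIDTH = 8  # assumed: 64 pixels in total, arranged as an 8 x 8 grid


def decode_runs(encoded):
    """Expand runs such as '9W' into one long string of pixel letters."""
    runs = re.findall(r"(\d+)([WB])", encoded)
    return "".join(colour * int(count) for count, colour in runs)


def to_rows(pixels, width):
    """Cut the flat pixel string into rows of the given width."""
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def to_bit_pairs(encoded):
    """Rewrite '9W 6B' as '91 60', storing white as 1 and black as 0."""
    runs = re.findall(r"(\d+)([WB])", encoded)
    return " ".join(count + ("1" if colour == "W" else "0") for count, colour in runs)


for row in to_rows(decode_runs(ENCODED), WIDTH):
    print(row.replace("W", ".").replace("B", "#"))
```

The runs continue from the end of one row into the start of the next, so the decoder only needs the image width to rebuild the grid.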
b) Describe how the following Coloured image could use the Run Length Encoding (RLE) algorithm to reduce its size without losing its quality.
Pixel colour | Red | Green | Blue |
Black | 0 | 0 | 0 |
White | 255 | 255 | 255 |
Green | 0 | 255 | 0 |
Red | 255 | 0 | 0 |
2 0 0 0 4 0 255 0 3 0 0 0 6 255 255 255 1 0 0 0 2 0 255 0 4 255 0 0 4 0 255 0
1 255 255 255 2 255 0 0 1 255 255 255 4 0 255 0 4 255 0 0 4 0 255 0 4 255 255 255
2 0 255 0 1 0 0 0 2 255 255 255 2 255 0 0 2 255 255 255 3 0 0 0 4 0 255 0 2 0 0 0
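Each run in the coloured image is stored as four values: the run length followed by the red, green and blue values of the pixel. The Python sketch below expands the runs and compares the sizes. It assumes each value takes one byte; the names are chosen for this illustration.

```python
ENCODED = """
2 0 0 0 4 0 255 0 3 0 0 0 6 255 255 255 1 0 0 0 2 0 255 0 4 255 0 0 4 0 255 0
1 255 255 255 2 255 0 0 1 255 255 255 4 0 255 0 4 255 0 0 4 0 255 0 4 255 255 255
2 0 255 0 1 0 0 0 2 255 255 255 2 255 0 0 2 255 255 255 3 0 0 0 4 0 255 0 2 0 0 0
"""


def decode_rgb_runs(text):
    """Expand 'count R G B' groups into a flat list of (R, G, B) pixels."""
    values = [int(v) for v in text.split()]
    pixels = []
    for i in range(0, len(values), 4):
        count, red, green, blue = values[i:i + 4]
        pixels.extend([(red, green, blue)] * count)
    return pixels


pixels = decode_rgb_runs(ENCODED)
raw_values = len(pixels) * 3          # 3 values (R, G, B) per pixel
rle_values = len(ENCODED.split())     # 4 values per run
print(len(pixels), "pixels:", raw_values, "values raw,", rle_values, "values with RLE")
```

The saving is smaller than for the black and white image because every run has to carry three colour values as well as its length.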
Q14. a) Describe how Lossy compression reduces the file size.
b) Describe how lossy compression is used to reduce an Image file size.
c) Describe how lossy compression is used to reduce a Sound file size.
d) Describe how lossy compression is used to reduce a Video file size.
e) Give reason why to choose lossy over lossless compression to compress image or audio files.
Q15. Explain how MP3 lossy compression algorithm reduces the audio file size retaining most of the original music quality.
⇒ An MP3 file uses lossy compression that reduces its size by about 90%, by permanently removing the sounds that the human ear cannot hear.
⇒ It also removes quieter sounds that are masked by louder ones, keeping only the sound that can be clearly heard. This technique is called perceptual music shaping.
⇒ The MP3 algorithm then compresses the file further using lossless compression, replacing repeated bit patterns with shorter codes without any further loss of quality.
Q16. Explain how MP4 lossy compression algorithm reduces the video file size retaining most of its original quality.
⇒ An MP4 file uses lossy compression that reduces its size by removing pixel information from its video frames, such as colour shades and brightness variations, that the human eye cannot distinguish.
⇒ It also removes background noise and sounds that the human ear cannot hear.
⇒ It stores only the data that has changed from one frame to the next.
⇒ Hence, the removed data does not noticeably affect the quality of the video.
Q17. Explain how JPEG lossy compression algorithm reduces an Image file size retaining most of its original quality.
⇒ A JPEG file uses lossy compression that reduces its size by removing pixel information, such as colour shades and brightness variations, that the human eye cannot distinguish.
⇒ This is done by separating each pixel's colour from its brightness, which allows certain information to be discarded from the image without losing any noticeable image quality.
⇒ A JPEG file cannot be converted back to recover the raw data of the original bitmap image.
Q18. a) What are Uncompressed Image file formats.
⇒ RAW images are unprocessed, uncompressed images created by a camera or scanner. There are many different raw formats (such as .raw, .cr2, .nef, .orf and .sr2), because each camera company often has its own proprietary format.
b) Give two Lossless compression Image file formats.
c) Give two Lossy compression Image file formats.
Q19. Give two differences between Lossless compression and Lossy compression.
Lossless compression | Lossy compression |
Reduces the file size by replacing repeated data with shorter codes. | Reduces the file size by permanently deleting data that the file can do without and still serve its purpose. |
The file can be decompressed to its original state without losing any data. | The file cannot be restored to its original form once it is compressed. |
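The difference can be seen in a small Python sketch. The lossless half uses the standard `zlib` module. The lossy half is only a simple stand-in: it throws away the lower 4 bits of each 8-bit value, which is not how JPEG, MP3 or MP4 actually work, but it shows why deleted data cannot be recovered.

```python
import zlib

# Lossless: the decompressed data is identical to the original.
data = b"aaaaabbbbccddddd" * 50
packed = zlib.compress(data)
restored = zlib.decompress(packed)
print(len(data), "bytes ->", len(packed), "bytes, identical:", restored == data)

# Lossy (illustrative stand-in): keep only the upper 4 bits of each value.
samples = [12, 200, 37, 129]
reduced = [s >> 4 for s in samples]     # 4 bits per value instead of 8
rebuilt = [r << 4 for r in reduced]     # best guess when "decompressing"
print(samples, "->", rebuilt, "identical:", rebuilt == samples)
```

The lossy version halves the number of bits per value, but the rebuilt values only approximate the originals.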
REVISION : Statements and their key computing terms.
Reduction of the size of a file by removing repeated or redundant pieces of data; this can be lossy or lossless - | Compression |
The maximum rate of transfer of data across a network, measured in kilobits per second (kbps) or megabits per second (Mbps) - | Bandwidth |
A file compression method that allows the original file to be fully restored during the decompression process, for example, run length encoding (RLE) - | Lossless file compression |
A method used to reduce the size of a sound file using perceptual music shaping - | Audio compression |
A lossy file compression method used for music files - | MP3 |
A lossy file compression method used for multimedia files - | MP4 |
From Joint Photographic Experts Group; a form of lossy file compression used with image files which relies on the inability of the human eye to distinguish certain colour changes and hues - | JPEG |
A lossless file compression technique used to reduce the size of text and photo files in particular - | Run length encoding (RLE) |